Universality of Group Convolutional Neural Networks Based on Ridgelet Analysis on Groups

Neural Information Processing Systems

We show the universality of depth-2 group convolutional neural networks (GCNNs) in a unified and constructive manner based on ridgelet theory. Despite their widespread use in applications, the approximation properties of (G)CNNs have not been well investigated. The universality of (G)CNNs has been shown since the late 2010s, yet our understanding of how (G)CNNs represent functions is incomplete, because past universality theorems were shown in a case-by-case manner, by manually and carefully assigning the network parameters depending on the variety of convolution layers, and in an indirect manner, by converting/modifying the (G)CNNs into other universal approximators such as invariant polynomials and fully-connected networks. In this study, we formulate a versatile depth-2 continuous GCNN $S[\gamma]$ as a nonlinear mapping between group representations, and directly obtain an analysis operator, called the ridgelet transform, that maps a given function $f$ to the network parameter $\gamma$ so that $S[\gamma]=f$.
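
For orientation, the Euclidean prototype that the paper lifts to group representations is the classical depth-2 integral network and its ridgelet transform. This is a schematic sketch, with $\sigma$ an activation function and $\rho$ an analysis filter (the paper's $S[\gamma]$ replaces the affine map $a \cdot x - b$ with a group convolution):

$$S[\gamma](x) = \int \gamma(a,b)\,\sigma(a \cdot x - b)\,\mathrm{d}a\,\mathrm{d}b, \qquad R[f](a,b) = \int f(x)\,\overline{\rho(a \cdot x - b)}\,\mathrm{d}x,$$

and $S[R[f]] = c\,f$ for a nonzero constant $c$ whenever the pair $(\sigma,\rho)$ is admissible; this is the sense in which $R$ maps $f$ to parameters $\gamma$ with $S[\gamma]=f$ (up to normalization).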


Aspects of importance sampling in parameter selection for neural networks using ridgelet transform

Homma, Hikaru, Ohkubo, Jun

arXiv.org Artificial Intelligence

The choice of parameters in neural networks is crucial to performance, and an oracle distribution derived from the ridgelet transform enables us to obtain suitable initial parameters. In other words, the distribution of parameters is connected to the integral representation of the target function. The oracle distribution allows us to avoid the conventional backpropagation learning process; in simple cases, a linear regression alone suffices to construct the neural network. This study provides a new look at oracle distributions and ridgelet transforms from the viewpoint of importance sampling. In addition, we propose extensions of the parameter sampling methods. We demonstrate the importance-sampling viewpoint and the proposed sampling algorithms on one-dimensional and high-dimensional examples; the results suggest that the magnitude of the weight parameters can matter more than that of the intercept parameters.
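
A minimal sketch of the construction described here, assuming a Gaussian proposal distribution in place of the oracle distribution derived from the ridgelet transform (the target function and all constants are illustrative): sample the hidden parameters, then fit only the output layer by linear regression.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative one-dimensional target on [-1, 1]
x = np.linspace(-1.0, 1.0, 200)[:, None]
f = np.sin(np.pi * x).ravel()

# Sample hidden parameters (a, b) from a proposal distribution;
# the oracle distribution from the ridgelet transform of f would
# replace this Gaussian proposal.
n_hidden = 100
a = rng.normal(0.0, 5.0, size=n_hidden)  # weight parameters
b = rng.normal(0.0, 2.0, size=n_hidden)  # intercept parameters

# Hidden-layer features sigma(a*x - b); no backpropagation is used.
features = np.tanh(x * a - b)

# Output weights by ordinary least squares (a linear regression).
c, *_ = np.linalg.lstsq(features, f, rcond=None)
print("max abs error:", np.abs(features @ c - f).max())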


Constructive Universal Approximation Theorems for Deep Joint-Equivariant Networks by Schur's Lemma

Sonoda, Sho, Hashimoto, Yuka, Ishikawa, Isao, Ikeda, Masahiro

arXiv.org Machine Learning

We present a unified constructive universal approximation theorem, based on group representation theory, covering a wide range of learning machines including both shallow and deep neural networks. Constructive here means that the distribution of parameters is given in a closed-form expression (called the ridgelet transform). In contrast to the case of shallow models, the expressive power of deep models has been analyzed in a case-by-case manner. Recently, Sonoda et al. (2023a,b) developed a systematic method to show a constructive approximation theorem from scalar-valued joint-group-invariant feature maps, covering a formal deep network. However, each hidden layer was formalized as an abstract group action, so it was not possible to cover real deep networks defined by composites of nonlinear activation functions. In this study, we extend the method to vector-valued joint-group-equivariant feature maps, so as to cover such real networks.
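
Schematically, the Schur's-lemma proof pattern the abstract refers to (a sketch in the notation of the related papers, with $S$ the network, $R$ the ridgelet transform, and $\pi$ an irreducible unitary representation of the symmetry group $G$): if the composite $S \circ R$ commutes with $\pi$,

$$\pi(g) \circ (S \circ R) = (S \circ R) \circ \pi(g) \quad \text{for all } g \in G,$$

then Schur's lemma forces $S \circ R = c\,\mathrm{Id}$ for some scalar $c$; exhibiting a single $f$ with $S[R[f]] \neq 0$ shows $c \neq 0$, and hence $S[R[f]/c] = f$ for every $f$, i.e. universality.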


A unified Fourier slice method to derive ridgelet transform for a variety of depth-2 neural networks

Sonoda, Sho, Ishikawa, Isao, Ikeda, Masahiro

arXiv.org Machine Learning

To investigate neural network parameters, it is easier to study the distribution of parameters than to study the parameters in each neuron. The ridgelet transform is a pseudo-inverse operator that maps a given function $f$ to the parameter distribution $\gamma$ so that a network $\mathtt{NN}[\gamma]$ reproduces $f$, i.e. $\mathtt{NN}[\gamma]=f$. For depth-2 fully-connected networks on a Euclidean space, the ridgelet transform is known in closed form, so we can describe how the parameters are distributed. However, for a variety of modern neural network architectures, a closed-form expression had not been known. In this paper, we explain a systematic method using Fourier expressions to derive ridgelet transforms for a variety of modern networks, such as networks on finite fields $\mathbb{F}_p$, group convolutional networks on an abstract Hilbert space $\mathcal{H}$, fully-connected networks on noncompact symmetric spaces $G/K$, and pooling layers (the $d$-plane ridgelet transform).
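
In the Euclidean depth-2 case, the Fourier-slice structure behind the method reads, schematically (with $\hat{\ }$ the Fourier transform, taken in the bias variable $b$ on the left-hand side, and $\rho$ the analysis filter of the ridgelet transform $R$):

$$\widehat{R[f]}(a,\omega) = \hat{f}(\omega a)\,\overline{\hat{\rho}(\omega)},$$

so the parameter distribution along direction $a$ is read off from the slice of $\hat{f}$ through $\{\omega a : \omega \in \mathbb{R}\}$; rewriting a network's feature map in such a Fourier form is what lets the same recipe produce ridgelet transforms for the other architectures listed.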


Joint Group Invariant Functions on Data-Parameter Domain Induce Universal Neural Networks

Sonoda, Sho, Ishi, Hideyuki, Ishikawa, Isao, Ikeda, Masahiro

arXiv.org Machine Learning

The symmetry and geometry of input data are considered to be encoded in the internal data representation inside a neural network, but the specific encoding rule has been less investigated. In this study, we present a systematic method to induce a generalized neural network and its right-inverse operator, called the ridgelet transform, from a joint group invariant function on the data-parameter domain. Since the ridgelet transform is an inverse, (1) it can describe the arrangement of parameters needed for the network to represent a target function, which is understood as the encoding rule, and (2) it implies the universality of the network. Based on group representation theory, we present a new simple proof of universality by using Schur's lemma in a unified manner covering a wide class of networks, for example, the original ridgelet transform, formal deep networks, and the dual voice transform. Since traditional universality theorems were demonstrated based on functional analysis, this study sheds light on the group-theoretic aspect of approximation theory, connecting geometric deep learning to abstract harmonic analysis.

Keywords: ridgelet transform, universality, joint group invariant function, Schur's lemma
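
Schematically, the induction described here (a sketch under the assumption of a scalar-valued invariant feature map): a function $\phi$ on the data-parameter domain satisfying the joint invariance $\phi(g \cdot x, g \cdot \theta) = \phi(x,\theta)$ induces the generalized network

$$S[\gamma](x) = \int \gamma(\theta)\,\phi(x,\theta)\,\mathrm{d}\theta,$$

which intertwines the group actions on data and parameters (substitute $\theta \mapsto g \cdot \theta$ under the integral), so the Schur's-lemma argument above applies and $S \circ R$ is a nonzero scalar multiple of the identity, yielding universality.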